[SPARK-12343][YARN] Simplify Yarn client and client argument #11603
jerryshao wants to merge 10 commits into apache:master
Conversation
Do we still need to support environment variables?
No. As I pointed out, the other JIRA, SPARK-3374, was proposing to remove them. There might be some internal env variables used just to communicate between the Client and the AM, but anything external can be removed for 2.x.
I think this could be done in SPARK-3374 as a separate patch to clean up all the internal env variables.
That would be OK, but I think you have already removed most of them; it looks like there are only a couple left in this file, so it might be just as easy to do it here.
Test build #52743 has finished for PR 11603 at commit
Hmm, it looks like this overlaps a lot with https://issues.apache.org/jira/browse/SPARK-3374, which is to deprecate the env variables, remove the deprecated configs, and remove the client args. Someone is actively working on that too, so I guess it kind of stinks to duplicate the work. Taking a quick skim, it seems like you did everything proposed there, so we might as well mark these as duplicates. Or actually, why don't you just update this PR to reference both JIRAs in the description?
Removing this and letting it use spark.jars switches to Spark's own addJar mechanism for distributing jars instead of the YARN distributed cache like before. I'm not sure I want to change that; I would rather keep things as consistent as possible using the YARN distributed cache unless we have a reason to change.
If this means we need to add another config like spark.yarn.dist.jars, I'm fine with that. We can leave it undocumented for now if we want, since spark.jars/spark.files isn't documented either, though I'm not quite sure why.
Yeah, we shouldn't lose the distributed cache functionality.
On that topic, but probably orthogonal to this change, at some point I'd like to see yarn-client somehow also use the distributed cache for these jars. IIRC that doesn't happen at the moment.
(Reading the rest of the change, if you figure out an easy way to have YARN's Client class handle spark.jars, you could solve both problems here... maybe some check in SparkContext to not distribute them in YARN mode?)
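For illustration, a hypothetical sketch of that check (speculative, not part of this PR; `master` and `jars` are existing SparkContext fields):

```scala
// Speculative sketch: let YARN's Client distribute spark.jars via the
// distributed cache, and skip Spark's own file-server distribution on YARN.
if (jars != null && !master.startsWith("yarn")) {
  jars.foreach(addJar)
}
```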
I see, thanks a lot for your suggestions.
Personally I'm fine with leaving some parameters in ClientArguments as long as they don't overlap with configs. The main thing is to not have to process the same config/setting multiple times. Here we either need the parameters or need to add new configs. Since these are all private, we can change it later if needed.
Note: I think the biggest thing here is to thoroughly test this to make sure all the various options work in both modes (cluster/client) with spark-submit, pyspark, spark-shell, and sparkr.
Do we need to pass these as explicit command-line arguments to the AM? Since we're simplifying code, we might as well take the chance and make the AM read these from the configuration too.
@vanzin, so you mean we should also simplify the command-line arguments for the AM?
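As a rough illustration of that suggestion (assuming typed config entries like the ones discussed below; not code from this PR):

```scala
// Sketch only: instead of parsing "--executor-memory" and "--executor-cores"
// in ApplicationMasterArguments, the AM reads them from SparkConf via typed
// config entries (entry names assumed).
val executorMemory = sparkConf.get(EXECUTOR_MEMORY)  // e.g. "1g"
val executorCores = sparkConf.get(EXECUTOR_CORES)    // e.g. 1
```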
Part of SPARK-12343 was to make
This is a core config, so it should be declared in core/src/main/scala/org/apache/spark/internal/config/package.scala instead. Same for EXECUTOR_MEMORY and PY_FILES, and probably also EXECUTOR_CORES, although I'm not completely sure about that last one.
I see, will change it.
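A minimal sketch of what that move could look like, following the ConfigBuilder pattern quoted later in this thread (the value types and defaults here are assumptions):

```scala
// In core/src/main/scala/org/apache/spark/internal/config/package.scala.
private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
  .stringConf
  .optional

private[spark] val EXECUTOR_MEMORY = ConfigBuilder("spark.executor.memory")
  .stringConf
  .optional

private[spark] val EXECUTOR_CORES = ConfigBuilder("spark.executor.cores")
  .intConf
  .withDefault(1)
```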
Here I put all the additional jars into a new configuration, spark.yarn.dist.jars; this is picked up by the YARN Client and put into the distributed cache. So now additional jars go into the distributed cache in both yarn-client and yarn-cluster mode.
One other question: do we also need to put the user jar into the distributed cache in yarn-client mode? I think it is doable; I'm just not sure whether there is any special concern.
I think we should just leave that as-is for now. We can file a separate JIRA if we want to change it.
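To illustrate the dist.jars flow described above (an assumed usage example, not code from this PR):

```scala
import org.apache.spark.SparkConf

// Jars listed in spark.yarn.dist.jars are picked up by YARN's Client and
// placed in the distributed cache in both yarn-client and yarn-cluster mode.
val conf = new SparkConf()
  .set("spark.yarn.dist.jars", "hdfs:///libs/extra.jar,file:///opt/libs/local.jar")
```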
So dist.files and dist.archives are public and documented; it seems like we should make dist.jars public and document it in the YARN docs as well, unless someone has a reason not to.
Sure, I will add it to the YARN docs.
Looks like there's already another config, spark.jars, that handles this; maybe we don't need to add another one, and we could keep dist.jars for internal, YARN-only use.
spark.jars is for distributing jars via Spark's internal mechanisms, whereas this goes through the distributed cache. We should add it to the YARN-only section of the docs, similar to dist.files and dist.archives.
Test build #52825 has finished for PR 11603 at commit
Test build #53412 has finished for PR 11603 at commit
Test build #53664 has finished for PR 11603 at commit
Test build #53751 has finished for PR 11603 at commit
We should remove the SPARK_LOG4J_CONF functionality in Client.scala. Also, have you looked at the test failures?
OK, I will remove this old one. It looks like the unit test failure is not related.
Another environment variable is
Test build #54025 has finished for PR 11603 at commit
We should file a separate JIRA for SPARK_JAVA_OPTS, since it's used in core and other places.
private[spark] val DRIVER_USER_CLASS_PATH_FIRST =
  ConfigBuilder("spark.driver.userClassPathFirst").booleanConf.withDefault(false)

private[spark] val DRIVER_MEMORY = ConfigBuilder("spark.driver.memory")
Mostly good, just a few comments.
Test build #54420 has finished for PR 11603 at commit
Test build #54499 has finished for PR 11603 at commit
(args.addJars, LocalResourceType.FILE, true),
(args.files, LocalResourceType.FILE, false),
(args.archives, LocalResourceType.ARCHIVE, false)
(sparkConf.get(JARS_TO_DISTRIBUTE).orNull, LocalResourceType.FILE, true),
minor: I think you could simplify this a little bit by listing just the options (instead of sparkConf.get(option).orNull); if you made the config entries lists defaulting to Nil (adding toSequence to their builders, and using withDefault instead of optional) you could save even more code below.
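A sketch of that suggestion, using the builder calls named in the comment (the FILES_TO_DISTRIBUTE and ARCHIVES_TO_DISTRIBUTE names are assumptions):

```scala
// Make the entry sequence-valued with an empty default...
private[spark] val JARS_TO_DISTRIBUTE = ConfigBuilder("spark.yarn.dist.jars")
  .stringConf
  .toSequence
  .withDefault(Nil)

// ...so the call site can list the options directly, without the
// sparkConf.get(option).orNull wrapping:
List(
  (JARS_TO_DISTRIBUTE, LocalResourceType.FILE, true),
  (FILES_TO_DISTRIBUTE, LocalResourceType.FILE, false),
  (ARCHIVES_TO_DISTRIBUTE, LocalResourceType.ARCHIVE, false)
)
```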
LGTM, I just left a really minor suggestion to save a few lines of code, but no need to address that right now.
Test build #54596 has finished for PR 11603 at commit
Merging to master, thanks!
What changes were proposed in this pull request?
Currently in Spark on YARN, configurations can be passed through SparkConf, environment variables, and command-line arguments, and some of these overlap, such as the client arguments and SparkConf. So here I propose to simplify the command-line arguments.
How was this patch tested?
This patch was tested manually and with the unit tests.
CC @vanzin @tgravescs, please help review this proposal. The original purpose of this JIRA is to remove ClientArguments; during the refactoring it turned out that some arguments like --class and --arg are not so easy to replace, so here I remove most of the command-line arguments and keep only the minimal set.
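For reference, a hedged sketch of that minimal set (field names beyond --class and --arg are assumptions):

```scala
// Sketch only: ClientArguments keeps just the options that don't map
// cleanly onto SparkConf entries.
private[spark] class ClientArguments(args: Array[String]) {
  var userClass: String = null      // --class
  var primaryPyFile: String = null  // --primary-py-file
  var userArgs: Seq[String] = Nil   // --arg (may be repeated)

  // parseArgs(args.toList) would populate the fields above.
}
```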